Abstract

When a neuron fires and the resulting action potential travels down its 
axon toward other neurons' dendrites, the effect on each of those neurons 
is mediated by the weight of the synapse that separates it from the firing 
neuron. This weight, in turn, is affected by the postsynaptic neuron's 
response through a mechanism that is thought to underlie important
processes such as learning and memory. Although difficult to quantify,
cortical synaptic weights have been found to obey a long-tailed unimodal 
distribution peaking near the lowest values, thus confirming some of the 
predictive models built previously. These models are all causally local, in 
the sense that they refer to the situation in which a number of neurons all 
fire directly at the same postsynaptic neuron. Consequently, they neces- 
sarily embody assumptions regarding the generation of action potentials 
by the presynaptic neurons that have little biological interpretability. In 
this letter we introduce a network model of large groups of interconnected 
neurons and demonstrate, making none of the assumptions that charac- 
terize the causally local models, that its long-term behavior gives rise to 
a distribution of synaptic weights with the same properties that were ex- 
perimentally observed. In our model the action potentials that create a 
neuron's input are, ultimately, the product of network-wide causal chains 
relating what happens at a neuron to the firings of others. Our model 
is then of a causally global nature and predicates the emergence of the 
synaptic-weight distribution on network structure and function. As such, 
it has the potential to become instrumental also in the study of other 
emergent cortical phenomena. 
The weight of a synapse between a neuron's axon and another's dendrite is
generally understood to be some measure of how influential an action potential
fired by the presynaptic neuron can be on the buildup of such a potential in
the postsynaptic neuron. While the physical entities whose measurement can 
be said to relate to synaptic weights are various [121 1301 HH HI] , recent exper- 
imental work involving measurements of the excitatory postsynaptic potential 
amplitude has revealed that synaptic weights follow a long-tailed distribution 
that is unimodal and peaks near the lowest voltage values [M] . Understanding 
the processes that give rise to a distribution with these properties can be greatly 
enhanced by the construction of mathematical models that take into account 
the nature of each neuron involved (excitatory or inhibitory), the nature of a 
synapse's plasticity in terms of how its weight changes in response to inter- 
neuron signaling, and also the distribution of firings in time. Predictive models 
have been built with varying degrees of success [23] [30] [22], the most successful
ones drawing on relatively well established knowledge regarding the proportion
of inhibitory neurons to be used and the rule to change synaptic weights [30].

Invariably, though, these models have relied on examining one single post- 
synaptic neuron toward which firing patterns are directed that in essence seek 
to summarize the entire input history of the postsynaptic neuron by a simple 
stochastic process. Arguably this history is one of the most important elements 
in giving rise to the synaptic-weight distribution in a way that can be understood
biologically [7], but in all current models there is no choice but to summarize 
it beyond retrieval. This happens because the models are all strictly local, al- 
lowing for no causal dependency between what happens at two neurons unless 
they are no farther apart from each other than one single synapse. The model 
we now introduce addresses this severe shortcoming by combining a network 
structure and algorithm with the proven mathematical elements of the previous 
models. 

The new model has a structural component and an algorithmic one. The 
structural component is a directed graph D whose nodes correspond to neurons 
that can be either excitatory or inhibitory. For i and j two distinct nodes such 
that at least one of them is excitatory, an edge directed from i to j represents a 
synapse with associated weight w_ij. No edge exists between two inhibitory nodes
[2]. The algorithmic component turns each node in D into a simple simulator of
the corresponding neuron, employing message passing on the edges along their 
directions to simulate the signaling through the corresponding synapses when 
nodes fire. Collectively, the nodes behave as an asynchronous distributed algo- 
rithm [5], here referred to as A, each executing a simple procedure P whenever 
receiving a message, possibly sending messages itself while executing P but re- 
maining idle at all other times. Because nodes only do any processing in this 
reactive manner, at least one node is needed that initially executes P once with- 
out any incoming message to respond to and then starts behaving reactively 
like the others. We call such a node an initiator. 

At node j, let v_j stand for the node's potential. Let also v^0 and v^t be
a node's rest potential and threshold potential, respectively, the same for all
nodes. The effect of running P is for j to probabilistically decide whether to
fire and, if it does fire, to send messages on all outgoing edges while setting v_j
to v^0. If P is run as the initial processing by an initiator, then the firing occurs
with probability 1 and P involves no actions other than the ones just described.
If not, then let i be the sender of the triggering message. The firing occurs
with probability min{1, (v_j - v^0)/(v^t - v^0)} after v_j has been updated to either
v_j + w_ij (if i is excitatory) or v_j - w_ij (if i is inhibitory). Then the weight w_ij
is considered for an update.
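
The reactive step of procedure P can be illustrated with a minimal Python sketch. It assumes the firing probability has the form min{1, (v_j - v^0)/(v^t - v^0)} and uses the values v^0 = -15 and v^t = 0 reported for the experiments. The function names, and the clamping of the probability at 0 when an inhibitory input pushes v_j below v^0, are assumptions of this sketch rather than part of the model's description.

```python
import random

V_REST = -15.0    # rest potential v^0 (value used in the experiments)
V_THRESH = 0.0    # threshold potential v^t (value used in the experiments)


def firing_probability(v_j, v_rest=V_REST, v_thresh=V_THRESH):
    """Probability that node j fires at potential v_j."""
    p = (v_j - v_rest) / (v_thresh - v_rest)
    # The lower clamp at 0 is an assumption of this sketch; the text
    # only states the upper cap at 1.
    return max(0.0, min(1.0, p))


def receive(v_j, w_ij, sender_is_excitatory, rng=random):
    """Process one message from sender i at node j.

    Returns the new potential of j and whether j fired.
    """
    # Excitatory senders raise the potential, inhibitory ones lower it.
    v_j = v_j + w_ij if sender_is_excitatory else v_j - w_ij
    fired = rng.random() < firing_probability(v_j)
    if fired:
        v_j = V_REST  # firing resets the potential to rest
    return v_j, fired
```

With these values a node at rest that receives an excitatory message of weight 1 fires with probability 1/15, while a node at or above threshold fires with certainty.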

The updating of Wij seeks to mimic the commonly accepted generalization of 
the Hebbian rule embodied in the spike-timing-dependent plasticity principles 
[UllS], according to which the change incurred by a synapse's weight depends on 
the extent to which there is a causal dependency of what happens at a neuron 
upon the other's firing. As a general rule, the synaptic weight is increased 
(potentiated) if the postsynaptic neuron fires in response to the firing by the 
presynaptic neuron, decreased (depressed) otherwise. In either case the amount 
of change to the synaptic weight depends on how close in time the relevant 
firings are, becoming negligible with increasing separation. Procedure P follows 
these principles by keeping track of the latest firing by j so that a decision can 
be made on whether to increase or decrease Wij . If j does fire in response to the 
message received from i, then Wij is increased. If it does not but the previous 
message received from any source did cause j to fire, then Wij is decreased. 
The weight Wij remains unchanged in all other cases. The actual amount of 
change to Wij depends on whether it is to be increased or decreased, and so 
does the nature of the change (by a fixed amount or by proportion) [9l [TOl [18] . 
An increase in w_ij is implemented by setting w_ij to min{1, w_ij + δ} with δ > 0,
a decrease by setting w_ij to (1 - α)w_ij with 0 < α < 1, thus ensuring that
synaptic weights remain in the [0, 1] interval if so started.
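
The three cases of the weight update can be written down compactly. The sketch below assumes an additive increase capped at 1 and a proportional decrease, with the step sizes 0.01 and 0.05 taken from the experimental setup. The function and argument names are hypothetical.

```python
DELTA = 0.01   # additive potentiation step (value used in the experiments)
ALPHA = 0.05   # proportional depression factor (value used in the experiments)


def update_weight(w_ij, fired_now, previous_message_fired,
                  delta=DELTA, alpha=ALPHA):
    """Return the new weight of synapse (i, j) after j handles i's message.

    fired_now: j fired in response to the message from i.
    previous_message_fired: the previous message j received, from any
    source, caused j to fire.
    """
    if fired_now:
        return min(1.0, w_ij + delta)      # potentiation, capped at 1
    if previous_message_fired:
        return (1.0 - alpha) * w_ij        # depression by proportion
    return w_ij                            # unchanged in all other cases
```

Because the increase is capped at 1 and the decrease multiplies by a factor in (0, 1), a weight that starts in [0, 1] never leaves that interval.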

Running algorithm A starts with choosing one or more initiators, each of 
which executes P and then starts behaving like all other nodes. At any time it 
may happen that a node has more than one input message to process, in which 
case the order in which they are taken is the order of message reception. Be- 
cause this order is in principle arbitrary, A is seen to acquire another degree of 
indeterminacy, in addition to that which is already present owing to the proba- 
bilistic decisions. We have conducted extensive computational experimentation 
with A on a graph D intended to model a simple cortex, in line with significant 
recent work that draws on the theory of graphs to help solve problems in neu- 
roscience [JSllSIllSllHllIllIIllEllESlEH]. We regard D as a random graph but, 
unlike some of the early work on cortical modeling by such graphs [2], where 
fully random graphs [13] were used, we let D have a scale-free structure [TO] ,
with parameter as suggested by some of the more recent findings [T2j [29] . Thus,
a randomly chosen node i in D has k outgoing edges with probability propor- 
tional to k~^'^. Moreover, inspired by recent work on the modeling of cortical 
systems [T7J [TB] , we let each outgoing edge of i lead to another randomly chosen 
node j with probability proportional to e~^'^, where d is the Euclidean distance 
between i and j when the nodes of D are placed uniformly at random on a 
radius-1 sphere (Figure 1), provided i and j are not both inhibitory.
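
A construction along these lines can be sketched in Python with the standard library only. The power-law exponent `tau` and the distance-decay rate `lam` are left as caller-supplied parameters, since their values are not fixed by this sketch. The minimum out-degree of 1 and the choice of distinct targets (no parallel edges) are further assumptions, and all names are hypothetical.

```python
import math
import random


def random_point_on_sphere(rng):
    """Uniform point on the unit sphere via normalized Gaussian coordinates."""
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 0.0:
            return (x / norm, y / norm, z / norm)


def build_graph(n, tau, lam, inhibitory_fraction, seed=None):
    """Return (positions, inhibitory node set, directed edge set)."""
    rng = random.Random(seed)
    pos = [random_point_on_sphere(rng) for _ in range(n)]

    # Out-degree k drawn with probability proportional to k**(-tau), k >= 1.
    degrees = list(range(1, n))
    degree_weights = [k ** -tau for k in degrees]
    out_degree = [rng.choices(degrees, weights=degree_weights)[0]
                  for _ in range(n)]

    inhibitory = set(rng.sample(range(n), round(inhibitory_fraction * n)))

    edges = set()
    for i in range(n):
        # Candidate targets exclude i itself and inhibitory-inhibitory pairs.
        targets = [j for j in range(n)
                   if j != i and not (i in inhibitory and j in inhibitory)]
        # Target weight decays exponentially with Euclidean distance.
        weights = [math.exp(-lam * math.dist(pos[i], pos[j])) for j in targets]
        for _ in range(min(out_degree[i], len(targets))):
            idx = rng.choices(range(len(targets)), weights=weights)[0]
            edges.add((i, targets.pop(idx)))
            weights.pop(idx)
    return pos, inhibitory, edges
```

For example, `build_graph(1000, tau, lam, 0.2)` would produce a candidate D instance for chosen values of `tau` and `lam`.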
Figure 1: Network topology. Restricted to two dimensions for visual clarity, a
D instance comprises nodes positioned randomly on a radius-1 circle and edges,
drawn as chords of the circle, that tend to be more abundant over lower Eu- 
clidean distances. Excitatory nodes are represented by filled circles, inhibitory 
nodes by empty circles. 

All computational experiments have adhered to the methods described next, 
which refer to sequences of 10 000 runs of algorithm A. The first run in a sequence 
operates on initial node potentials and synaptic weights chosen randomly from 
the intervals [v^0, v^t] and [0, 1], respectively, with v^0 = -15 and v^t = 0. Each
subsequent run operates on the potentials and weights left by the previous run. 
For n the number of nodes in D, a new set of 0.05n initiators is chosen randomly
at the beginning of each run. A run of A is implemented as a sequential program 
that selects the next node to be processed randomly (first out of the group of 
initiators for their first executions of P, then out of those nodes that have at 
least one message to be received). A new run in a sequence is only started 
after the previous one has died out (no more messages to be processed remain), 
which is guaranteed to happen eventually with probability 1. The remaining 
parameters used by procedure P are δ = 0.01 and α = 0.05. All our results refer
to 50 000 independent sequences, of which each 500 sequences correspond to a 
new D instance. A D instance is constructed by first placing all nodes uniformly 
at random on a radius-1 sphere, then selecting the number of outgoing edges 
for each node. Nodes are then chosen to be excitatory or inhibitory randomly, 
provided a certain proportion is respected, and the destination of each edge 
is decided. The graph that is actually used in the run sequences is the giant 
strongly connected component of D, so a directed path exists from any node
to any other. For the connectivity distribution and construction method in use
this component comprises about 0.95n nodes on average. 
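
A single run of algorithm A, implemented as a sequential program in the manner just described, might look like the following sketch. It assumes a firing probability of the form min{1, (v_j - v^0)/(v^t - v^0)} clamped below at 0, an additive weight increase and a proportional weight decrease, and the parameter values reported above. Resetting the "previous message caused a firing" flag at the start of each run is a simplification of this sketch, and every name in it is hypothetical.

```python
import random
from collections import deque

V_REST, V_THRESH = -15.0, 0.0   # rest and threshold potentials
DELTA, ALPHA = 0.01, 0.05       # weight increase step, decrease proportion


def run_once(out_edges, inhibitory, v, w, initiators, rng):
    """Execute one run, mutating potentials v and weights w in place.

    out_edges: dict node -> list of successor nodes
    v: dict node -> potential; w: dict (i, j) -> weight
    Returns the number of messages processed before the run dies out.
    """
    inbox = {j: deque() for j in v}       # senders, in order of reception
    last_fired = {j: False for j in v}    # did j's previous message fire it?

    def fire(j):
        v[j] = V_REST
        for k in out_edges.get(j, ()):
            inbox[k].append(j)

    pending = list(initiators)
    rng.shuffle(pending)                  # initiators are taken in random order
    for j in pending:
        fire(j)                           # initiators fire with probability 1

    processed = 0
    while True:
        ready = [j for j in inbox if inbox[j]]
        if not ready:
            return processed              # no messages remain
        j = rng.choice(ready)             # next node is selected at random
        i = inbox[j].popleft()
        v[j] += -w[i, j] if i in inhibitory else w[i, j]
        p = max(0.0, min(1.0, (v[j] - V_REST) / (V_THRESH - V_REST)))
        fired = rng.random() < p
        if fired:
            w[i, j] = min(1.0, w[i, j] + DELTA)
            fire(j)
        elif last_fired[j]:
            w[i, j] *= 1.0 - ALPHA
        last_fired[j] = fired
        processed += 1
```

A sequence of runs would call `run_once` repeatedly on the same `v` and `w`, drawing a fresh set of initiators each time.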

Our results, here given for n = 1 000 and the well-accepted proportion of
0.2n inhibitory nodes [2113, show that the synaptic-weight distribution becomes
analogous to the distribution unveiled by experimentation along the sequences of 
runs of algorithm A described above (Figure [2]). The process is gradual, leading 
the weights to become relatively concentrated around a single low-value mode 
while still allowing some residual probability to remain at the higher values. 
The long-term distribution is seen to stabilize even as the weights continue to 
evolve, thus suggesting the existence of an underlying weight dynamics whose 
effect on the overall distribution is nevertheless practically imperceptible. The 
existence of this persistent dynamics is revealed by the causal history of each 
terminal message reception (one that does not lead to the firing of the receiver), 
which can be significantly deep with respect to the relatively short average path 
of a scale-free network [20] [Figure 3(a)]. The sending of every message by a
non-initiator causes a synaptic weight to be increased, unless it already equals
1, but weight-1 synapses are very rare, especially when arranged as a path in
D. So the causal histories we have discovered do indeed hint at the existence 
of a dynamics of weight evolution in which weights both increase and decrease 
in complex patterns. Additional confirmation is provided by the average weight 
of the synapses involved in the causal histories of terminal message receptions, 
which is consistently less than 1 and also decreases throughout the runs as the 
synaptic-weight distribution settles [Figure 3(b)].

Every run in the sequences to which Figures 2 and 3 refer involves a new
group of initiators and as such provides new possibilities regarding the branching 
of causal histories and how they affect firings and weight changes throughout 
the network. Monitoring the traffic of messages as they traverse edges and 
reach nodes is then a means to do some quantification of how the cascading 
runs, with their intermingling causal trees rooted at many different initiators, 
cooperate in promoting the emergence of the synaptic-weight distribution. We 
have found that the long-term distributions of how many runs traverse an edge 
or reach a node (Figure 4), allowing as they do for relatively high numbers
with significant probabilities, suggest that some sort of information integration 
is taking place among portions of the network as the runs unfold. Perhaps such 
integration occurs in a sense similar to that which has been theorized recently 
regarding the emergence of higher functions such as consciousness [5]. If so, 
then network algorithmics such as we have discussed may come to provide a 
powerful framework to test the assumptions and eventual predictions of such 
theories. 
Figure 2: The synaptic-weight distribution, shown after selected runs of
algorithm A. Probabilities are binned to a fixed width of 0.01.
Figure 3: Causal depth of a message reception and associated synaptic weights.
The causal depth of a message reception is the size of its causal history, i.e., 
the number of firings that precede it along the chain of firings that begins at 
some initiator when it fires for the first time, each preceding the next by direct 
causation: given any two subsequent firings in this chain, the first entails the 
sending of a message whose reception triggers the second. (a) Maximum and
average causal depth of terminal message receptions during the course of each 
run. (b) Average weight (before updates) of the synapses involved in the causal 
histories of terminal message receptions. 
Figure 4: Final distributions of the number of runs in which an edge is traversed
or a node is reached. An edge is said to be traversed in a run when at least 
one message is sent along it during the course of that run. A node is said to be 
reached in a run when it receives at least one message during the course of that 
run. Probabilities are binned to a fixed width of 50 for edges, 100 for nodes. 